Covariance-adapting algorithm for semi-bandits with application to sparse rewards
Perrault, Pierre, Perchet, Vianney, Valko, Michal
We investigate stochastic combinatorial semi-bandits, where the entire joint distribution of outcomes impacts the complexity of the problem instance (unlike in standard bandits). The distributions typically considered depend on specific parameter values, which must be known in advance in theory but are quite difficult to estimate in practice; an example is the commonly assumed sub-Gaussian family. We alleviate this issue by instead considering a new general family of sub-exponential distributions, which contains bounded and Gaussian ones. We prove a new lower bound on the expected regret for this family, parameterized by the unknown covariance matrix of outcomes, a tighter quantity than the sub-Gaussian matrix. We then construct an algorithm that uses covariance estimates, and provide a tight asymptotic analysis of its regret. Finally, we apply and extend our results to the family of sparse outcomes, which has applications in many recommender systems.
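To make the notion of "covariance estimates" concrete, the following is a minimal sketch of how pairwise covariances could be tracked from semi-bandit feedback, where only the outcomes of the base arms in the played action are observed each round. It is an illustration under stated assumptions, not the algorithm from the paper: the class name `PairwiseCovarianceEstimator`, its methods, and the plain empirical (biased) estimator are all hypothetical choices made here.

```python
from collections import defaultdict


class PairwiseCovarianceEstimator:
    """Running empirical covariance from semi-bandit feedback.

    Hypothetical helper for illustration only; not the paper's estimator.
    """

    def __init__(self):
        # (i, j) -> number of rounds in which arms i and j were both observed
        self.count = defaultdict(int)
        # (i, j) -> sum of X_i over the rounds where i and j were both observed
        self.sum_i = defaultdict(float)
        # (i, j) -> sum of X_i * X_j over those same rounds
        self.sum_prod = defaultdict(float)

    def update(self, outcomes):
        """Record one round; `outcomes` maps each played base arm to its outcome."""
        for i, x_i in outcomes.items():
            for j, x_j in outcomes.items():
                self.count[(i, j)] += 1
                self.sum_i[(i, j)] += x_i
                self.sum_prod[(i, j)] += x_i * x_j

    def covariance(self, i, j):
        """Empirical covariance of arms i and j, or None if never seen together."""
        n = self.count.get((i, j), 0)
        if n == 0:
            return None
        mean_i = self.sum_i[(i, j)] / n
        mean_j = self.sum_i[(j, i)] / n
        return self.sum_prod[(i, j)] / n - mean_i * mean_j


est = PairwiseCovarianceEstimator()
est.update({0: 1.0, 1: 2.0})  # round 1: the played action contains arms 0 and 1
est.update({0: 3.0, 1: 6.0})  # round 2: same action, different outcomes
print(est.covariance(0, 1))
```

Each pair is estimated only from rounds in which both arms were played together, which is the information semi-bandit feedback actually provides. A covariance-adapting algorithm would additionally need confidence bounds around such estimates; those are omitted here.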